Results 1 - 20 of 40
1.
bioRxiv ; 2024 Jan 21.
Article in English | MEDLINE | ID: mdl-38293115

ABSTRACT

Here, we describe the "Obelisks," a previously unrecognised class of viroid-like elements that we first identified in human gut metatranscriptomic data. "Obelisks" share several properties: (i) apparently circular RNA ~1kb genome assemblies, (ii) predicted rod-like secondary structures encompassing the entire genome, and (iii) open reading frames coding for a novel protein superfamily, which we call the "Oblins". We find that Obelisks form their own distinct phylogenetic group with no detectable sequence or structural similarity to known biological agents. Further, Obelisks are prevalent in tested human microbiome metatranscriptomes with representatives detected in ~7% of analysed stool metatranscriptomes (29/440) and in ~50% of analysed oral metatranscriptomes (17/32). Obelisk compositions appear to differ between the anatomic sites and are capable of persisting in individuals, with continued presence over >300 days observed in one case. Large scale searches identified 29,959 Obelisks (clustered at 90% nucleotide identity), with examples from all seven continents and in diverse ecological niches. From this search, a subset of Obelisks are identified to code for Obelisk-specific variants of the hammerhead type-III self-cleaving ribozyme. Lastly, we identified one case of a bacterial species (Streptococcus sanguinis) in which a subset of defined laboratory strains harboured a specific Obelisk RNA population. As such, Obelisks comprise a class of diverse RNAs that have colonised, and gone unnoticed in, human, and global microbiomes.

2.
Nat Commun ; 14(1): 2591, 2023 05 05.
Article in English | MEDLINE | ID: mdl-37147358

ABSTRACT

Earth's life may have originated as self-replicating RNA, and it has been argued that RNA viruses and viroid-like elements are remnants of such pre-cellular RNA world. RNA viruses are defined by linear RNA genomes encoding an RNA-dependent RNA polymerase (RdRp), whereas viroid-like elements consist of small, single-stranded, circular RNA genomes that, in some cases, encode paired self-cleaving ribozymes. Here we show that the number of candidate viroid-like elements occurring in geographically and ecologically diverse niches is much higher than previously thought. We report that, amongst these circular genomes, fungal ambiviruses are viroid-like elements that undergo rolling circle replication and encode their own viral RdRp. Thus, ambiviruses are distinct infectious RNAs showing hybrid features of viroid-like RNAs and viruses. We also detected similar circular RNAs, containing active ribozymes and encoding RdRps, related to mitochondrial-like fungal viruses, highlighting fungi as an evolutionary hub for RNA viruses and viroid-like elements. Our findings point to a deep co-evolutionary history between RNA viruses and subviral elements and offer new perspectives in the origin and evolution of primordial infectious agents, and RNA life.


Subjects
RNA Viruses, Catalytic RNA, Viroids, Viroids/genetics, Catalytic RNA/genetics, Viral RNA/genetics, Virus Replication/genetics, RNA/genetics, RNA Viruses/genetics, RNA-Dependent RNA Polymerase/genetics, Fungi/genetics
3.
Nat Commun ; 13(1): 6968, 2022 11 15.
Article in English | MEDLINE | ID: mdl-36379955

ABSTRACT

Multiple sequence alignments are widely used to infer evolutionary relationships, enabling inferences of structure, function, and phylogeny. Standard practice is to construct one alignment by some preferred method and use it in further analysis; however, undetected alignment bias can be problematic. I describe Muscle5, a novel algorithm which constructs an ensemble of high-accuracy alignments with diverse biases by perturbing a hidden Markov model and permuting its guide tree. Confidence in an inference is assessed as the fraction of the ensemble which supports it. Applied to phylogenetic tree estimation, I show that ensembles can confidently resolve topologies with low bootstrap according to standard methods, and conversely that some topologies with high bootstraps are incorrect. Applied to the phylogeny of RNA viruses, ensemble analysis shows that recently adopted taxonomic phyla are probably polyphyletic. Ensemble analysis can improve confidence assessment in any inference from an alignment.


Subjects
Algorithms, Biological Evolution, Phylogeny, Sequence Alignment, Sequence Homology
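
The abstract of entry 3 states that confidence in an inference is the fraction of the ensemble that supports it. A minimal Python sketch of that one idea follows. It is not Muscle5 code: the function name `ensemble_confidence`, the representation of each replicate tree as a set of clades, and the toy data are all assumptions made for illustration.

```python
from typing import Callable, Iterable, TypeVar

A = TypeVar("A")


def ensemble_confidence(ensemble: Iterable[A], supports: Callable[[A], bool]) -> float:
    """Fraction of ensemble members for which `supports` returns True."""
    members = list(ensemble)
    if not members:
        raise ValueError("ensemble is empty")
    return sum(1 for member in members if supports(member)) / len(members)


# Hypothetical toy data: each replicate tree is reduced to the set of clades it contains.
clade_ab = frozenset({"A", "B"})
clade_ac = frozenset({"A", "C"})
trees = [
    {clade_ab},
    {clade_ab},
    {clade_ac},
    {clade_ab},
]

confidence_ab = ensemble_confidence(trees, lambda tree: clade_ab in tree)
```

In this toy ensemble the clade {A, B} appears in three of four replicate trees, so `confidence_ab` is 0.75. The same function applies to any inference that can be tested per replicate, which matches the abstract's claim that the approach is not specific to phylogeny.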
4.
Pathogens ; 11(7), 2022 Jul 19.
Article in English | MEDLINE | ID: mdl-35890050

ABSTRACT

Conventionally, hyperimmune globulin drugs manufactured from pooled immunoglobulins from vaccinated or convalescent donors have been used in treating infections where no treatment is available. This is especially important where multi-epitope neutralization is required to prevent the development of immune-evading viral mutants that can emerge upon treatment with monoclonal antibodies. Using microfluidics, flow sorting, and a targeted integration cell line, a first-in-class recombinant hyperimmune globulin therapeutic against SARS-CoV-2 (GIGA-2050) was generated. Using processes similar to conventional monoclonal antibody manufacturing, GIGA-2050, comprising 12,500 antibodies, was scaled-up for clinical manufacturing and multiple development/tox lots were assessed for consistency. Antibody sequence diversity, cell growth, productivity, and product quality were assessed across different manufacturing sites and production scales. GIGA-2050 was purified and tested for good laboratory procedures (GLP) toxicology, pharmacokinetics, and in vivo efficacy against natural SARS-CoV-2 infection in mice. The GIGA-2050 master cell bank was highly stable, producing material at consistent yield and product quality up to >70 generations. Good manufacturing practices (GMP) and development batches of GIGA-2050 showed consistent product quality, impurity clearance, potency, and protection in an in vivo efficacy model. Nonhuman primate toxicology and pharmacokinetics studies suggest that GIGA-2050 is safe and has a half-life similar to other recombinant human IgG1 antibodies. These results supported a successful investigational new drug application for GIGA-2050. This study demonstrates that a new class of drugs, recombinant hyperimmune globulins, can be manufactured consistently at the clinical scale and presents a new approach to treating infectious diseases that targets multiple epitopes of a virus.

5.
Nature ; 602(7895): 142-147, 2022 02.
Article in English | MEDLINE | ID: mdl-35082445

ABSTRACT

Public databases contain a planetary collection of nucleic acid sequences, but their systematic exploration has been inhibited by a lack of efficient methods for searching this corpus, which (at the time of writing) exceeds 20 petabases and is growing exponentially. Here we developed a cloud computing infrastructure, Serratus, to enable ultra-high-throughput sequence alignment at the petabase scale. We searched 5.7 million biologically diverse samples (10.2 petabases) for the hallmark gene RNA-dependent RNA polymerase and identified well over 10^5 novel RNA viruses, thereby expanding the number of known species by roughly an order of magnitude. We characterized novel viruses related to coronaviruses, hepatitis delta virus and huge phages, respectively, and analysed their environmental reservoirs. To catalyse the ongoing revolution of viral discovery, we established a free and comprehensive database of these data and tools. Expanding the known sequence diversity of viruses can reveal the evolutionary origins of emerging pathogens and improve pathogen surveillance for the anticipation and mitigation of future pandemics.


Subjects
Cloud Computing, Genetic Databases, RNA Viruses/genetics, RNA Viruses/isolation & purification, Sequence Alignment/methods, Virology/methods, Virome/genetics, Animals, Archives, Bacteriophages/enzymology, Bacteriophages/genetics, Biodiversity, Coronavirus/classification, Coronavirus/enzymology, Coronavirus/genetics, Molecular Evolution, Hepatitis Delta Virus/enzymology, Hepatitis Delta Virus/genetics, Humans, Molecular Models, RNA Viruses/classification, RNA Viruses/enzymology, RNA-Dependent RNA Polymerase/chemistry, RNA-Dependent RNA Polymerase/genetics, Software
6.
Nat Biotechnol ; 39(8): 989-999, 2021 08.
Article in English | MEDLINE | ID: mdl-33859400

ABSTRACT

Plasma-derived polyclonal antibody therapeutics, such as intravenous immunoglobulin, have multiple drawbacks, including low potency, impurities, insufficient supply and batch-to-batch variation. Here we describe a microfluidics and molecular genomics strategy for capturing diverse mammalian antibody repertoires to create recombinant multivalent hyperimmune globulins. Our method generates diverse mixtures of thousands of recombinant antibodies, enriched for specificity and activity against therapeutic targets. Each hyperimmune globulin product comprised thousands to tens of thousands of antibodies derived from convalescent or vaccinated human donors or from immunized mice. Using this approach, we generated hyperimmune globulins with potent neutralizing activity against severe acute respiratory syndrome coronavirus-2 (SARS-CoV-2) in under 3 months, Fc-engineered hyperimmune globulins specific for Zika virus that lacked antibody-dependent enhancement of disease, and hyperimmune globulins specific for lung pathogens present in patients with primary immune deficiency. To address the limitations of rabbit-derived anti-thymocyte globulin, we generated a recombinant human version and demonstrated its efficacy in mice against graft-versus-host disease.


Subjects
B-Lymphocytes/immunology, COVID-19/therapy, Globulins/biosynthesis, SARS-CoV-2/immunology, Animals, Antiviral Antibodies/immunology, CHO Cells, Cricetulus, Enzyme-Linked Immunosorbent Assay, Globulins/immunology, Humans, Passive Immunization, Mice, Recombinant Proteins/biosynthesis, Recombinant Proteins/immunology, Zika Virus/immunology, COVID-19 Serotherapy
7.
Nat Biotechnol ; 38(5): 609-619, 2020 05.
Article in English | MEDLINE | ID: mdl-32393905

ABSTRACT

T cells engineered to express antigen-specific T cell receptors (TCRs) are potent therapies for viral infections and cancer. However, efficient identification of clinical candidate TCRs is complicated by the size and complexity of T cell repertoires and the challenges of working with primary T cells. Here we present a high-throughput method to identify TCRs with high functional avidity from diverse human T cell repertoires. The approach used massively parallel microfluidics to generate libraries of natively paired, full-length TCRαβ clones, from millions of primary T cells, which were then expressed in Jurkat cells. The TCRαβ-Jurkat libraries enabled repeated screening and panning for antigen-reactive TCRs using peptide major histocompatibility complex binding and cellular activation. We captured more than 2.9 million natively paired TCRαβ clonotypes from six healthy human donors and identified rare (<0.001% frequency) viral-antigen-reactive TCRs. We also mined a tumor-infiltrating lymphocyte sample from a patient with melanoma and identified several tumor-specific TCRs, which, after expression in primary T cells, led to tumor cell killing.


Subjects
Antigens/analysis, Alpha-Beta T-Cell Antigen Receptors/immunology, T-Lymphocytes/cytology, Cell Engineering, Gene Library, Humans, Jurkat Cells, Tumor-Infiltrating Lymphocytes/immunology, Melanoma/immunology, T-Lymphocytes/immunology, Viruses/immunology
8.
Antibodies (Basel) ; 8(1), 2019 Feb 19.
Article in English | MEDLINE | ID: mdl-31544823

ABSTRACT

To discover therapeutically relevant antibody candidates, many groups use mouse immunization followed by hybridoma generation or B cell screening. One modern approach is to screen B cells by generating natively paired single chain variable fragment (scFv) display libraries in yeast. Such methods typically rely on soluble antigens for scFv library screening. However, many therapeutically relevant cell-surface targets are difficult to express in a soluble protein format, complicating discovery. In this study, we developed methods to screen humanized mouse-derived yeast scFv libraries using recombinant OX40 protein in cell lysate. We used deep sequencing to compare screening with cell lysate to screening with soluble OX40 protein, in the context of mouse immunizations using either soluble OX40 or OX40-expressing cells and OX40-encoding DNA vector. We found that all tested methods produce a unique diversity of scFv binders. However, when we reformatted forty-one of these scFv as full-length monoclonal antibodies (mAbs), we observed that mAbs identified using soluble antigen immunization with cell lysate sorting always bound cell surface OX40, whereas other methods had significant false positive rates. Antibodies identified using soluble antigen immunization and cell lysate sorting were also significantly more likely to activate OX40 in a cellular assay. Our data suggest that sorting with OX40 protein in cell lysate is more likely than other methods to retain the epitopes required for antibody-mediated OX40 agonism.

9.
MAbs ; 11(5): 870-883, 2019 07.
Article in English | MEDLINE | ID: mdl-30898066

ABSTRACT

Immunization of mice followed by hybridoma or B-cell screening is one of the most common antibody discovery methods used to generate therapeutic monoclonal antibody (mAb) candidates. There are a multitude of different immunization protocols that can generate an immune response in animals. However, an extensive analysis of the antibody repertoires that these alternative immunization protocols can generate has not been performed. In this study, we immunized mice that transgenically express human antibodies with either programmed cell death 1 protein or cytotoxic T-lymphocyte associated protein 4 using four different immunization protocols, and then utilized a single cell microfluidic platform to generate tissue-specific, natively paired immunoglobulin (Ig) repertoires from each method and enriched for target-specific binders using yeast single-chain variable fragment (scFv) display. We deep sequenced the scFv repertoires from both the pre-sort and post-sort libraries. All methods and both targets yielded similar oligoclonality, variable (V) and joining (J) gene usage, and divergence from germline of enriched libraries. However, there were differences between targets and/or immunization protocols for overall clonal counts, complementarity-determining region 3 (CDR3) length, and antibody/CDR3 sequence diversity. Our data suggest that, although different immunization protocols may generate a response to an antigen, performing multiple immunization protocols in parallel can yield greater Ig diversity. We conclude that modern microfluidic methods, followed by an extensive molecular genomic analysis of antibody repertoires, can be used to quickly analyze new immunization protocols or mouse platforms.


Subjects
Humanized Monoclonal Antibodies/genetics, Antibody Diversity, Immunization/methods, Microfluidics/methods, Animals, Humanized Monoclonal Antibodies/immunology, B-Lymphocytes/immunology, CTLA-4 Antigen/immunology, Complementarity Determining Regions/genetics, Genomics/methods, High-Throughput Nucleotide Sequencing, Humans, Hybridomas, Mice, Transgenic Mice, Peptide Library, Programmed Cell Death 1 Receptor/immunology, Single-Chain Antibodies/genetics, Single-Chain Antibodies/immunology
10.
PeerJ ; 6: e4652, 2018.
Article in English | MEDLINE | ID: mdl-29682424

ABSTRACT

Prediction of taxonomy for marker gene sequences such as 16S ribosomal RNA (rRNA) is a fundamental task in microbiology. Most experimentally observed sequences are diverged from reference sequences of authoritatively named organisms, creating a challenge for prediction methods. I assessed the accuracy of several algorithms using cross-validation by identity, a new benchmark strategy which explicitly models the variation in distances between query sequences and the closest entry in a reference database. When the accuracy of genus predictions was averaged over a representative range of identities with the reference database (100%, 99%, 97%, 95% and 90%), all tested methods had ≤50% accuracy on the currently-popular V4 region of 16S rRNA. Accuracy was found to fall rapidly with identity; for example, better methods were found to have V4 genus prediction accuracy of ∼100% at 100% identity but ∼50% at 97% identity. The relationship between identity and taxonomy was quantified as the probability that a rank is the lowest shared by a pair of sequences with a given pair-wise identity. With the V4 region, 95% identity was found to be a twilight zone where taxonomy is highly ambiguous because the probabilities that the lowest shared rank between pairs of sequences is genus, family, order or class are approximately equal.

11.
Bioinformatics ; 34(14): 2371-2375, 2018 07 15.
Article in English | MEDLINE | ID: mdl-29506021

ABSTRACT

Motivation: The 16S ribosomal RNA (rRNA) gene is widely used to survey microbial communities. Sequences are often clustered into Operational Taxonomic Units (OTUs) as proxies for species. The canonical clustering threshold is 97% identity, which was proposed in 1994 when few 16S rRNA sequences were available, motivating a reassessment on current data. Results: Using a large set of high-quality 16S rRNA sequences from finished genomes, I assessed the correspondence of OTUs to species for five representative clustering algorithms using four accuracy metrics. All algorithms had comparable accuracy when tuned to a given metric. Optimal identity thresholds were ∼99% for full-length sequences and ∼100% for the V4 hypervariable region. Availability and implementation: Reference sequences and source code are provided in the Supplementary Material. Supplementary information: Supplementary data are available at Bioinformatics online.


Subjects
rRNA Genes, Microbiota/genetics, 16S Ribosomal RNA/genetics, DNA Sequence Analysis/methods, Software, Algorithms, Cluster Analysis
12.
MAbs ; 10(3): 431-443, 2018 04.
Article in English | MEDLINE | ID: mdl-29376776

ABSTRACT

Deep sequencing and single-chain variable fragment (scFv) yeast display methods are becoming more popular for discovery of therapeutic antibody candidates in mouse B cell repertoires. In this study, we compare a deep sequencing and scFv display method that retains native heavy and light chain pairing with a related method that randomly pairs heavy and light chain. We performed the studies in a humanized mouse, using interleukin 21 receptor (IL-21R) as a test immunogen. We identified 44 high-affinity binder scFv with the native pairing method and 100 high-affinity binder scFv with the random pairing method. 30% of the natively paired scFv binders were also discovered with the randomly paired method, and 13% of the randomly paired binders were also discovered with the natively paired method. Additionally, 33% of the scFv binders discovered only in the randomly paired library were initially present in the natively paired pre-sort library. Thus, a significant proportion of "randomly paired" scFv were actually natively paired. We synthesized and produced 46 of the candidates as full-length antibodies and subjected them to a panel of binding assays to characterize their therapeutic potential. 87% of the antibodies were verified as binding IL-21R by at least one assay. We found that antibodies with native light chains were more likely to bind IL-21R than antibodies with non-native light chains, suggesting a higher false positive rate for antibodies from the randomly paired library. Additionally, the randomly paired method failed to identify nearly half of the true natively paired binders, suggesting a higher false negative rate. We conclude that natively paired libraries have critical advantages in sensitivity and specificity for antibody discovery programs.


Subjects
B-Lymphocytes/immunology, Gene Library, Immunoglobulin Light Chains, Interleukin-21 Receptor alpha Subunit, Single-Chain Antibodies, Animals, Humans, Immunoglobulin Light Chains/biosynthesis, Immunoglobulin Light Chains/genetics, Immunoglobulin Light Chains/immunology, Interleukin-21 Receptor alpha Subunit/antagonists & inhibitors, Interleukin-21 Receptor alpha Subunit/immunology, Mice, Single-Chain Antibodies/genetics, Single-Chain Antibodies/immunology
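
The overlap figures in the abstract of entry 12 (44 native binders, 100 random binders, 30% and 13% shared) can be cross-checked with simple arithmetic. The Python sketch below is a consistency check only: the shared count it derives is inferred from rounded percentages and is not a number stated in the abstract, and all variable names are illustrative.

```python
# Counts stated in the abstract.
native_binders = 44
random_binders = 100

# Shared binders implied by each reported percentage, rounded to whole scFv.
shared_via_native = round(native_binders * 30 / 100)  # 30% of the native binders
shared_via_random = round(random_binders * 13 / 100)  # 13% of the random binders

# Inferred, not reported: both percentages should describe the same overlap.
consistent = shared_via_native == shared_via_random

# Derived counts under that assumption.
only_native = native_binders - shared_via_native
only_random = random_binders - shared_via_random
distinct_binders = only_native + only_random + shared_via_native
```

Both percentages round to the same overlap, so the two figures in the abstract agree with each other under this reading.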
13.
PeerJ ; 5: e3889, 2017.
Article in English | MEDLINE | ID: mdl-29018622

ABSTRACT

Next-generation sequencing of 16S ribosomal RNA is widely used to survey microbial communities. Sequences are typically assigned to Operational Taxonomic Units (OTUs). Closed- and open-reference OTU assignment matches reads to a reference database at 97% identity (closed), then clusters unmatched reads using a de novo method (open). Implementations of these methods in the QIIME package were tested on several mock community datasets with 20 strains using different sequencing technologies and primers. Richness (number of reported OTUs) was often greatly exaggerated, with hundreds or thousands of OTUs generated on Illumina datasets. Between-sample diversity was also found to be highly exaggerated in many cases, with weighted Jaccard distances between identical mock samples often close to one, indicating very low similarity. Non-overlapping hyper-variable regions in 70% of species were assigned to different OTUs. On mock communities with Illumina V4 reads, 56% to 88% of predicted genus names were false positives. Biological inferences obtained using these methods are therefore not reliable.
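
The abstract of entry 13 reports weighted Jaccard distances close to one between identical mock samples. It does not give the formula, so the Python sketch below assumes the common definition, one minus the sum of per-OTU minimum abundances divided by the sum of per-OTU maximum abundances. The function name and the toy abundance tables are hypothetical, and the variant used in the paper may differ.

```python
from typing import Mapping


def weighted_jaccard_distance(a: Mapping[str, float], b: Mapping[str, float]) -> float:
    """1 - sum(min abundance) / sum(max abundance) over all OTUs in either sample."""
    otus = set(a) | set(b)
    shared = sum(min(a.get(otu, 0), b.get(otu, 0)) for otu in otus)
    total = sum(max(a.get(otu, 0), b.get(otu, 0)) for otu in otus)
    if total == 0:
        return 0.0  # two empty samples are treated as identical
    return 1 - shared / total


# Hypothetical OTU tables for two replicates of the same mock community.
replicate_1 = {"otu_1": 2, "otu_2": 1}
replicate_2 = {"otu_1": 1, "otu_2": 1, "otu_3": 1}

distance = weighted_jaccard_distance(replicate_1, replicate_2)
```

A distance of 0 means the two samples have identical OTU abundances and 1 means they share no OTU at all, which is why values near one for replicate mock samples indicate a problem with the OTU assignment rather than with the samples.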

14.
MAbs ; 9(8): 1282-1296, 2017.
Article in English | MEDLINE | ID: mdl-28846502

ABSTRACT

Affinity-matured, functional anti-pathogen antibodies are present at low frequencies in natural human repertoires. These antibodies are often excellent candidates for therapeutic monoclonal antibodies. However, mining natural human antibody repertoires is a challenge. In this study, we demonstrate a new method that uses microfluidics, yeast display, and deep sequencing to identify 247 natively paired anti-pathogen single-chain variable fragments (scFvs), which were initially as rare as 1 in 100,000 in the human repertoires. Influenza A vaccination increased the frequency of influenza A antigen-binding scFv within the peripheral B cell repertoire from <0.1% in non-vaccinated donors to 0.3-0.4% in vaccinated donors, whereas pneumococcus vaccination did not increase the frequency of antigen-binding scFv. However, the pneumococcus scFv binders from the vaccinated library had higher heavy and light chain Replacement/Silent mutation (R/S) ratios, a measure of affinity maturation, than the pneumococcus binders from the corresponding non-vaccinated library. Thus, pneumococcus vaccination may increase the frequency of affinity-matured antibodies in human repertoires. We synthesized 10 anti-influenza A and nine anti-pneumococcus full-length antibodies that were highly abundant among antigen-binding scFv. All 10 anti-influenza A antibodies bound the appropriate antigen at KD<10 nM and neutralized virus in cellular assays. All nine anti-pneumococcus full-length antibodies bound at least one polysaccharide serotype, and 71% of the anti-pneumococcus antibodies that we tested were functional in cell killing assays. Our approach has future application in a variety of fields, including the development of therapeutic antibodies for emerging viral diseases, autoimmune disorders, and cancer.


Subjects
Anti-Infective Agents/immunology, Monoclonal Antibodies/immunology, Antibody Affinity/immunology, Genomics/methods, Microfluidics/methods, Amino Acid Sequence, Anti-Infective Agents/administration & dosage, Anti-Infective Agents/metabolism, Monoclonal Antibodies/administration & dosage, Monoclonal Antibodies/metabolism, High-Throughput Nucleotide Sequencing, Humans, Influenza A Virus/drug effects, Influenza A Virus/immunology, Peptide Library, Single-Chain Antibodies/genetics, Single-Chain Antibodies/immunology, Single-Chain Antibodies/metabolism, Streptococcus pneumoniae/drug effects, Streptococcus pneumoniae/immunology
15.
MAbs ; 9(8): 1270-1281, 2017.
Article in English | MEDLINE | ID: mdl-28846506

ABSTRACT

Conventionally, mouse hybridomas or well-plate screening are used to identify therapeutic monoclonal antibody candidates. In this study, we present an alternative to hybridoma-based discovery that combines microfluidics, yeast single-chain variable fragment (scFv) display, and deep sequencing to rapidly interrogate and screen mouse antibody repertoires. We used our approach on six wild-type mice to identify 269 molecules that bind to programmed cell death protein 1 (PD-1), which were present at an average of 1 in 2,000 in the pre-sort scFv libraries. Two rounds of fluorescence-activated cell sorting (FACS) produced populations of PD-1-binding scFv with a mean enrichment of 800-fold, whereas most scFv present in the pre-sort mouse repertoires were de-enriched. Therefore, our work suggests that most of the antibodies present in the repertoires of immunized mice are not strong binders to PD-1. We observed clusters of related antibody sequences in each mouse following FACS, suggesting evolution of clonal lineages. In the pre-sort repertoires, these putative clonal lineages varied in both the complementarity-determining region (CDR)3K and CDR3H, while the FACS-selected PD-1-binding subsets varied primarily in the CDR3H. PD-1 binders were generally not highly diverged from germline, showing 98% identity on average with germline V-genes. Some CDR3 sequences were discovered in more than one animal, even across different mouse strains, suggesting convergent evolution. We synthesized 17 of the anti-PD-1 binders as full-length monoclonal antibodies. All 17 full-length antibodies bound recombinant PD-1 with KD < 500 nM (average = 62 nM). Fifteen of the 17 full-length antibodies specifically bound surface-expressed PD-1 in a FACS assay, and nine of the antibodies functioned as checkpoint inhibitors in a cellular assay. We conclude that our method is a viable alternative to hybridomas, with key advantages in comprehensiveness and turnaround time.


Subjects
Monoclonal Antibodies/immunology, Antibody Affinity/immunology, Genomics/methods, Microfluidics/methods, Programmed Cell Death 1 Receptor/immunology, Animals, Monoclonal Antibodies/metabolism, Monoclonal Antibodies/pharmacology, Complementarity Determining Regions/genetics, Complementarity Determining Regions/immunology, Complementarity Determining Regions/metabolism, Flow Cytometry, High-Throughput Nucleotide Sequencing, Humans, Hybridomas, Mice, Peptide Library, Programmed Cell Death 1 Receptor/antagonists & inhibitors, Programmed Cell Death 1 Receptor/metabolism, Protein Binding/immunology, Single-Chain Antibodies/genetics, Single-Chain Antibodies/immunology, Single-Chain Antibodies/metabolism
17.
Bioinformatics ; 31(21): 3476-82, 2015 Nov 01.
Article in English | MEDLINE | ID: mdl-26139637

ABSTRACT

MOTIVATION: Next-generation sequencing produces vast amounts of data with errors that are difficult to distinguish from true biological variation when coverage is low. RESULTS: We demonstrate large reductions in error frequencies, especially for high-error-rate reads, by three independent means: (i) filtering reads according to their expected number of errors, (ii) assembling overlapping read pairs and (iii) for amplicon reads, by exploiting unique sequence abundances to perform error correction. We also show that most published paired read assemblers calculate incorrect posterior quality scores. AVAILABILITY AND IMPLEMENTATION: These methods are implemented in the USEARCH package. Binaries are freely available at http://drive5.com/usearch. CONTACT: robert@drive5.com SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
High-Throughput Nucleotide Sequencing/methods, Algorithms, Software
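
The first method in the abstract of entry 17, filtering reads by their expected number of errors, can be sketched in a few lines of Python. The expected number of errors in a read is the sum of the per-base error probabilities implied by its Phred scores, where a score Q corresponds to a probability of 10^(-Q/10). This is an illustration and not USEARCH code: the function names, the Phred+33 quality encoding, and the threshold of 1.0 are assumptions.

```python
def expected_errors(quality: str, offset: int = 33) -> float:
    """Sum of per-base error probabilities for a FASTQ quality string.

    Assumes Phred scores encoded as ASCII characters with the given offset
    (33 for the Sanger / Illumina 1.8+ convention).
    """
    return sum(10 ** (-(ord(char) - offset) / 10) for char in quality)


def passes_filter(quality: str, max_expected_errors: float = 1.0) -> bool:
    """Keep a read only if its expected error count is at or below the threshold.

    The default threshold of 1.0 is an illustrative choice, not a value
    taken from the abstract.
    """
    return expected_errors(quality) <= max_expected_errors


high_quality = "I" * 100  # Q40 at every base
low_quality = "+" * 100   # Q10 at every base

keep_high = passes_filter(high_quality)
keep_low = passes_filter(low_quality)
```

A 100-base read at Q40 has about 0.01 expected errors and is kept, while a 100-base read at Q10 has about 10 and is discarded. Filtering on this sum rather than on the mean quality score penalises long low-quality tails in proportion to the errors they are likely to contain.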
18.
Microbiome ; 3: 20, 2015.
Article in English | MEDLINE | ID: mdl-25995836

ABSTRACT

BACKGROUND: The operational taxonomic unit (OTU) is widely used in microbial ecology. Reproducibility in microbial ecology research depends on the reliability of OTU-based 16S ribosomal subunit RNA (rRNA) analyses. RESULTS: Here, we report that many hierarchical and greedy clustering methods produce unstable OTUs, with membership that depends on the number of sequences clustered. If OTUs are regenerated with additional sequences or samples, sequences originally assigned to a given OTU can be split into different OTUs. Alternatively, sequences assigned to different OTUs can be merged into a single OTU. This OTU instability affects alpha-diversity analyses such as rarefaction curves, beta-diversity analyses such as distance-based ordination (for example, Principal Coordinate Analysis (PCoA)), and the identification of differentially represented OTUs. Our results show that the proportion of unstable OTUs varies for different clustering methods. We found that the closed-reference method is the only one that produces completely stable OTUs, with the caveat that sequences that do not match a pre-existing reference sequence collection are discarded. CONCLUSIONS: As a compromise to the factors listed above, we propose using an open-reference method to enhance OTU stability. This type of method clusters sequences against a database and includes unmatched sequences by clustering them via a relatively stable de novo clustering method. OTU stability is an important consideration when analyzing microbial diversity and is a feature that should be taken into account during the development of novel OTU clustering methods.

19.
Microbes Environ ; 30(2): 145-50, 2015.
Article in English | MEDLINE | ID: mdl-25786896

ABSTRACT

The nuclear ribosomal internal transcribed spacer (ITS) region is the most commonly chosen genetic marker for the molecular identification of fungi in environmental sequencing and molecular ecology studies. Several analytical issues complicate such efforts, one of which is the formation of chimeric (artificially joined) DNA sequences during PCR amplification or sequence assembly. Several software tools are currently available for chimera detection, but rely to various degrees on the presence of a chimera-free reference dataset for optimal performance. However, no such dataset is available for use with the fungal ITS region. This study introduces a comprehensive, automatically updated reference dataset for fungal ITS sequences based on the UNITE database for the molecular identification of fungi. This dataset supports chimera detection throughout the fungal kingdom and for full-length ITS sequences as well as partial (ITS1 or ITS2 only) datasets. The performance of the dataset on a large set of artificial chimeras was above 99.5%, and we subsequently used the dataset to remove nearly 1,000 compromised fungal ITS sequences from public circulation. The dataset is available at http://unite.ut.ee/repository.php and is subject to web-based third-party curation.


Subjects
Artifacts, Fungal DNA/genetics, Ribosomal Spacer DNA/genetics, Environmental Microbiology, Fungi/classification, Metagenomics/methods, DNA Sequence Analysis, Fungal DNA/chemistry, Ribosomal Spacer DNA/chemistry, Fungi/genetics, Reference Standards
20.
Nat Methods ; 10(10): 996-8, 2013 Oct.
Article in English | MEDLINE | ID: mdl-23955772

ABSTRACT

Amplified marker-gene sequences can be used to understand microbial community structure, but they suffer from a high level of sequencing and amplification artifacts. The UPARSE pipeline reports operational taxonomic unit (OTU) sequences with ≤1% incorrect bases in artificial microbial community tests, compared with >3% incorrect bases commonly reported by other methods. The improved accuracy results in far fewer OTUs, consistently closer to the expected number of species in a community.


Subjects
Microbiota/genetics, Phylogeny, 16S Ribosomal RNA/genetics, Algorithms, Genetic Databases, Humans, Metagenomics, Research Design, Sensitivity and Specificity, Software